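[Science Declaration] The San Francisco Declaration on Research Assessment: Making Research Assessment More Scientific
This article comes from the San Francisco Declaration on Research Assessment group (DORA) and aims to improve how research outputs are evaluated. The declaration was first drafted in December 2012; this is its final version.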
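There is a pressing need to improve the methods that funding agencies, academic research institutions, and other parties use to evaluate the output of scientific research.
To address this problem, a group of editors and publishers of scholarly journals met on December 16, 2012, during the Annual Meeting of the American Society for Cell Biology (ASCB), and put forward a set of recommendations. These formed the basis of the San Francisco Declaration on Research Assessment. We invite interested parties from all fields of science to sign this declaration to indicate their support.
The outputs of scientific research are many and varied, including: research articles reporting new knowledge, data, reagents, and software; intellectual property; and rigorously trained young scientists. Funding agencies, institutions that employ scientists, and scientists themselves all want, and need, to assess the quality and impact of scientific outputs. It is therefore essential that scientific output be measured accurately and evaluated wisely.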
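The Journal Impact Factor is frequently used as the primary parameter for comparing the scientific output of individuals and institutions. The Journal Impact Factor, as calculated by Thomson Reuters, was originally created as a tool to help librarians decide which journals to purchase, not as a measure of the scientific quality of a research article. With that in mind, it is necessary to understand clearly the deficiencies of the Journal Impact Factor as a research assessment tool, which have been analyzed in many publications. These deficiencies include:
(A) citation distributions within journals are highly skewed;
(B) the properties of the Journal Impact Factor vary by field: it covers highly diverse article types, including primary research papers and reviews;
(C) the Journal Impact Factor can be manipulated (or even "gamed") through editorial policy;
(D) the data used to calculate the Journal Impact Factor are neither transparent nor openly available to the public.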
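Below we make a series of recommendations for improving the way research output is evaluated. In the future, outputs other than research articles will become increasingly important in assessing research effectiveness. However, the peer-reviewed research paper will remain the central research output used in research assessment. Our recommendations therefore focus first on assessment practices relating to research articles published in peer-reviewed journals, but they can and should be extended to other products recognized as important research outputs, such as datasets. The recommendations are aimed at funding agencies, academic research institutions, journals, organizations that supply metrics, and individual researchers.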
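The following themes run through these recommendations:
- the need to stop using journal-based metrics, such as the Journal Impact Factor, in funding, appointment, and promotion considerations;
- the need to assess research on its own merits rather than on the basis of the journal in which it is published;
- the need to capitalize on the opportunities provided by online publication (for example, relaxing unnecessary limits on the number of words, figures, and references in articles, and developing new indicators of significance and impact).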
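We recognize that many funding agencies, research institutions, publishers, and researchers are already encouraging better practice in research assessment, and that momentum is building toward more sophisticated and meaningful approaches to research evaluation. All key stakeholders should now jointly develop and adopt such approaches.
The signatories of the San Francisco Declaration on Research Assessment support the adoption of the following practices in research assessment: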
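General recommendation:
1. Stop using journal-based metrics (such as impact factors) as a surrogate measure of the quality of individual research articles, to assess an individual scientist's contributions, or as a basis for hiring, promotion, or funding decisions.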
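Recommendations for funding agencies:
2. Be explicit about the criteria used to evaluate the scientific productivity of grant applicants, and clearly emphasize (especially for early-career researchers) that the scientific content of a paper is much more important than publication metrics or the standing of the journal in which it was published.
3. For the purposes of research assessment, consider the value and impact of all research outputs (including datasets and software) in addition to research articles, and consider a broader range of impact measures, including qualitative indicators of research impact such as influence on policy and practice.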
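Recommendations for research institutions:
4. Be explicit about the criteria used in hiring, tenure, and promotion decisions, and clearly emphasize (especially for early-career researchers) that the scientific content of a paper is much more important than publication metrics or the standing of the journal in which it was published.
5. For the purposes of research assessment, consider the value and impact of all research outputs (including datasets and software) in addition to research articles, and consider a broader range of impact measures, including qualitative indicators of research impact such as influence on policy and practice.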
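Recommendations for publishers:
6. Greatly reduce the use of the journal impact factor as a primary promotional tool. Ideally, stop promoting the impact factor, or present it only as one of a range of journal-based metrics (e.g., 5-year impact factor, Eigenfactor, SCImago Journal Rank, h-index, editorial and publication times) that give a more multidimensional view of journal performance.
7. Provide article-level metrics to encourage a shift toward assessment based on the scientific content of an article rather than on publication metrics of the journal in which it was published.
8. Encourage responsible authorship practices and provide information about the specific contributions of each author.
9. Whether a journal is open-access or subscription-based, remove restrictions on the reuse of reference lists in research articles and make them openly available under the Creative Commons Public Domain Dedication.
10. Remove or reduce limits on the number of references in research articles. Wherever feasible, require citation of the primary literature rather than secondary review literature, in order to give credit to the group that first reported a finding.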
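Recommendations for organizations that supply metrics:
11. Provide, openly and transparently, the data and methods used to calculate all metrics.
12. Where possible, license metrics data for unrestricted reuse, and provide the data in machine-readable form.
13. Make clear that inappropriate manipulation of metrics will not be tolerated; also make clear what behavior constitutes inappropriate manipulation and what measures will be taken to combat it.
14. When metrics are used, aggregated, or compared, account for differences between article types (e.g., reviews versus research articles) and between subject areas.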
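Recommendations for researchers:
15. When serving on committees that make decisions about funding, hiring, tenure, or promotion, make assessments based on scientific content rather than journal metrics.
16. Wherever feasible, cite the primary literature in which findings were first reported rather than reviews, in order to give credit to those who deserve it.
17. Use a range of article metrics in personal and supporting statements as evidence of the impact of individual published articles and other research outputs.
18. Challenge research assessment practices that rely inappropriately on the Journal Impact Factor, and promote and disseminate best practices that focus on the value and impact of specific research outputs.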
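(By the San Francisco Declaration on Research Assessment group; compiled and translated by Li Hong and Wang Jianfang, National Science Library, Chinese Academy of Sciences)
Original text: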
San Francisco Declaration on Research Assessment
Putting science into the assessment of research
There is a pressing need to improve the ways in which the output of scientific research is evaluated by funding agencies, academic institutions, and other parties.
To address this issue, the group of editors and publishers of scholarly journals listed below met during the Annual Meeting of The American Society for Cell Biology (ASCB) in San Francisco, CA, on December 16, 2012. The group developed a set of recommendations, referred to as the San Francisco Declaration on Research Assessment. We invite interested parties to indicate their support by adding their names to this declaration.
The outputs from scientific research are many and varied, including: research articles reporting new knowledge, data, reagents, and software; intellectual property; and highly trained young scientists. Funding agencies, institutions that employ scientists, and scientists themselves all have a desire, and need, to assess the quality and impact of scientific outputs. It is imperative that scientific output be measured accurately, evaluated wisely, and used thoughtfully.
The Journal Impact Factor is frequently used as the primary parameter with which to measure the scientific output of individuals and institutions. The Journal Impact Factor, as calculated by Thomson Reuters, was originally created as a tool to help librarians identify journals to purchase, not as a measure of the scientific quality of research in an article.
With that in mind, it is critical to understand that the Journal Impact Factor has a number of well-documented deficiencies as a tool for research assessment. These limitations include: A) citation distributions within journals are highly skewed [1]; B) the properties of the Journal Impact Factor are field-specific; it is a composite of multiple, highly diverse article types, including primary research papers and reviews [2]; C) Impact Factors can be manipulated (or “gamed”) by editorial policy [3]; and D) data used to calculate the Journal Impact Factors are neither transparent nor openly available to the public [4,5].
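To make deficiency (A) concrete, here is a minimal sketch in Python. The citation counts are invented for illustration and the function name is hypothetical. It only mimics the "average citations per article" arithmetic behind a journal-level figure; it is not the procedure Thomson Reuters uses, whose underlying data, as noted above, are not openly available.

```python
from statistics import median

# Hypothetical citation counts for ten articles in one journal
# (illustrative numbers only, not real data).
citations = [0, 0, 1, 1, 2, 2, 3, 4, 7, 80]


def journal_average(counts):
    """Mean citations per article: a journal-level average,
    assumed here as a stand-in for an impact-factor-style figure."""
    return sum(counts) / len(counts)


avg = journal_average(citations)          # pulled upward by one highly cited article
typical = median(citations)               # what a typical article actually receives
share_below_avg = sum(c < avg for c in citations) / len(citations)

print(f"journal-level average: {avg}")
print(f"median article:        {typical}")
print(f"share of articles below the average: {share_below_avg:.0%}")
```

In this toy example a single highly cited article lifts the journal-level average well above what most articles receive, which is why a journal-level average says little about any individual paper.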
Below we make a number of recommendations for improving the way in which the actual quality of research output is evaluated. Outputs other than research articles will grow in importance in assessing research effectiveness in the future, but the peer-reviewed research paper will remain a central research output that informs research assessment. Our recommendations therefore focus primarily on practices relating to research articles published in peer-reviewed journals, but can and should be extended by recognising additional products, such as datasets, as important research outputs.
The recommendations are aimed at funding agencies, academic institutions, journals, organizations that supply metrics, and individual researchers.
A number of themes run through these recommendations:
- the need to eliminate use of journal-based metrics, such as impact factors, in funding, appointment and promotion considerations,
- the need to assess research on its own merits rather than on the basis of the journal in which the research is published, and
- the need to capitalize on the opportunities provided by online publication (such as relaxing unnecessary limits on the number of words, figures, and references in articles, and exploring new indicators of significance and impact).
We recognize that many funding agencies, institutions, publishers, and researchers are already encouraging improved practices in research assessment. Such steps are beginning to increase the momentum toward more sophisticated and meaningful approaches to research evaluation that can now be built upon and adopted by all of the key constituencies involved.
The signatories of the San Francisco Declaration support the adoption of the following practices in research assessment.
General Recommendation
1. Do not use journal-based metrics, such as journal impact factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion or funding decisions.
For funding agencies
2. Be explicit about the criteria used in evaluating the scientific productivity of grant applicants and clearly highlight, especially for early-stage investigators, that the scientific content of a paper is much more important than publication metrics or the identity of the journal in which it was published.
3. For the purposes of research assessment, consider the value and impact of all research outputs (including datasets and software) in addition to research publications, and consider a broad range of impact measures including qualitative indicators of research impact, such as influence on policy and practice.
For institutions
4. Be explicit about the criteria used to reach hiring, tenure, and promotion decisions, clearly highlighting, especially for early-stage investigators, that the scientific content of a paper is much more important than publication metrics or the identity of the journal in which it was published.
5. For the purposes of research assessment, consider the value and impact of all research outputs (including datasets and software) in addition to research publications, and consider a broad range of impact measures including qualitative indicators of research impact, such as influence on policy and practice.
For publishers
6. Greatly reduce emphasis on the journal impact factor as a promotional tool, ideally by ceasing to promote the impact factor or by presenting the metric in the context of a variety of journal-based metrics (e.g., 5-year impact factor, EigenFactor [6], SCImago [7], editorial and publication times, etc.) that provide a richer view of journal performance.
7. Make available a range of article-level metrics to encourage a shift toward assessment based on the scientific content of an article rather than publication metrics of the journal in which it was published.
8. Encourage responsible authorship practices and the provision of information about the specific contributions of each author.
9. Whether a journal is open-access or subscription-based, remove all reuse limitations on reference lists in research articles and make them available under the Creative Commons Public Domain Dedication. (See reference 8.)
10. Remove or reduce the constraints on the number of references in research articles, and, where appropriate, mandate the citation of primary literature in favor of reviews in order to give credit to the group(s) who first reported a finding.
For organizations that supply metrics
11. Be open and transparent by providing data and methods used to calculate all metrics.
12. Provide the data under a licence that allows unrestricted reuse, and provide computational access to data.
13. Be clear that inappropriate manipulation of metrics will not be tolerated; be explicit about what constitutes inappropriate manipulation and what measures will be taken to combat this.
14. Account for the variation in article types (e.g., reviews versus research articles) and in different subject areas when metrics are used, aggregated, or compared.
For researchers
15. When involved in committees making decisions about funding, hiring, tenure, or promotion, make assessments based on scientific content rather than publication metrics.
16. Wherever appropriate, cite primary literature in which observations are first reported rather than reviews in order to give credit where credit is due.
17. Use a range of article metrics and indicators on personal/supporting statements, as evidence of the impact of individual published articles and other research outputs [9].
18. Challenge research assessment practices that rely inappropriately on Journal Impact Factors and promote best practice that focuses on the value and influence of specific research outputs.